Abstract: We study the properties of the constrained minimal supersymmetric standard 
model (mSUGRA) by performing fits to updated indirect data, including the relic density 
of dark matter inferred from WMAP5. In order to find the extent to which $\mu < 0$ is 
disfavoured compared to $\mu > 0$, we compare the Bayesian evidence values for these models, 
which we obtain straightforwardly and with good precision from the recently developed 
multi-modal nested sampling ('MultiNest') technique. We find weak to moderate 
evidence for the $\mu > 0$ branch of mSUGRA over $\mu < 0$ and estimate the ratio of probabilities 
to be $P(\mu > 0)/P(\mu < 0) = 6$ to $61$ depending on the prior measure and range used. There is 
thus positive (but not overwhelming) evidence that $\mu > 0$ in mSUGRA. The MultiNest 
technique also delivers probability distributions of parameters and other relevant quantities 
such as superpartner masses. We explore the dependence of our results on the choice of 
the prior measure used. We also use the Bayesian evidence to quantify the consistency 
between the mSUGRA parameter inferences coming from the constraints that have the 
largest effects: $(g-2)_\mu$, $BR(b \to s\gamma)$ and cold dark matter (DM) relic density $\Omega_{DM} h^2$. 
1. Introduction 

The impending start of operation of the Large Hadron Collider (LHC) makes this a very 
exciting time for supersymmetric (SUSY) phenomenology. Numerous groups have been 
pursuing a programme to fit simple SUSY models and identify the regions in the parameter 
space that might be of interest with the forthcoming LHC data [1, 2, 3, 4, 5, 6]. The 
Minimal Supersymmetric Standard Model (MSSM) with one particular choice of universal 
boundary conditions at the grand unification scale, called either the Constrained Minimal 
Supersymmetric Standard Model (CMSSM) or mSUGRA [7], has been studied quite 
extensively in multi-parameter scans. mSUGRA has proved to be a popular choice for SUSY 
phenomenology because of the small number of free parameters. In mSUGRA, the scalar 
mass $m_0$, gaugino mass $M_{1/2}$ and tri-linear coupling $A_0$ are assumed to be universal at a 
gauge unification scale $M_{GUT} \sim 2 \times 10^{16}$ GeV. In addition, at the electroweak scale one 
selects $\tan\beta$, the ratio of Higgs vacuum expectation values, and sign($\mu$), where $\mu$ is the 
Higgs/higgsino mass parameter whose square is computed from the potential minimisation 
conditions of electroweak symmetry breaking (EWSB) and the empirical value of the mass 
of the $Z^0$ boson, $M_Z$. The family universality assumption is well motivated since flavour 
changing neutral currents are observed to be rare. Indeed several string models (see, for 
example Ref. [8]) predict approximate MSSM universality in the soft terms. Nevertheless, 
mSUGRA is just one (albeit popular) choice among a multitude of possibilities. 

Recently, Bayesian parameter estimation techniques using the Markov Chain Monte 
Carlo (MCMC) sampling have been applied to the study of mSUGRA, performing a 
multi-dimensional Bayesian fit to indirect constraints [9, 10, 11, 12, 13, 14, 15, 16]. Also, a study 
has been extended to large volume string compactified models [17]. A particularly important 
constraint comes from the cold dark matter (DM) relic density $\Omega_{DM} h^2$ determined by 
the Wilkinson Microwave Anisotropy Probe (WMAP). DM is assumed to consist solely of 
the lightest supersymmetric particle (LSP). As pointed out in [12], the accuracy of the DM 
constraint results in very narrow steep regions of degenerate $\chi^2$ minima as the system is 
rather under-constrained. This makes the global fit to all the relevant mSUGRA parameters 
potentially difficult. If the MSSM is confirmed in the forthcoming collider data, it will 
hopefully be possible to break many of these degeneracies using collider observables such as 
edges in kinematical distributions. However, it is expected that one degeneracy will remain 
from LHC data in the form of the overall mass scale of the sparticles. We apply the newly 
developed MultiNest technique [18, 19] to explore this highly degenerate parameter space 
efficiently. With this technique, one can also calculate the 'Bayesian evidence' which plays 
the central role in Bayesian model selection and hence allows one to distinguish between 
different models. 

Ref. ^^ performed a random scan of 10^ points in the parameter spaces of mSUGRA, 
minimal anomaly mediated SUSY breaking (mAMSB) and minimal gauge mediated SUSY 
breaking (mGMSB). $b$ and electroweak physics observables (but not the dark matter relic 
density) were used to assign a $\chi^2$ to each of the points. The resulting minimum $\chi^2$ values 
for each scenario were then compared in order to select which model is preferred by the 
data. Unfortunately, the conclusions drawn (that mAMSB is preferred by data) may have 
been reversed had the dark matter relic density been included in the $\chi^2$ fit. It is also not 
clear how accurate the resulting value of minimum $\chi^2$ is in each scenario, since the scans 
are necessarily sparse due to the high dimensionality of the parameter space^1. Recently, 
several studies of the mSUGRA parameter space have used Markov Chain Monte Carlo 
in order to focus on the joint analysis of indirect constraints from experiment with the 
$\Omega_{DM} h^2$ constraint as determined by WMAP and other data. We extend this approach by 
using MultiNest to calculate the Bayesian evidence, which, when compared with fits to 
different models, can be used for hypothesis testing. As an example, we consider $\mu > 0$ 
mSUGRA versus $\mu < 0$ mSUGRA as alternative hypotheses. In Ref. [12], the evidence 
ratio for these two quantities was calculated using the method of bridge sampling [21] in 
MCMCs. However, it is not clear how accurate the estimation of the evidence ratio was, and 
no uncertainties were quoted. The present approach robustly yields small uncertainties 
on the ratio, for a given hypothesis and prior probability distribution. Since Ref. [12], 
a tension has developed between the constraints coming from the anomalous magnetic 
moment of the muon $(g-2)_\mu$, and the branching ratio of the decay of $b$ quarks into $s$ quarks 
$BR(b \to s\gamma)$, which favour opposite signs of $\mu$ [14]. Ref. [14] investigated the constraints 
on continuous parameters for either sign of $\mu$ and used the Bayesian calibrated p-value 
method [22] to get a rough estimate of the upper limit for the evidence ratio between 
$\mu > 0$ mSUGRA and $\mu < 0$ mSUGRA of 10 : 1. We also use the evidence to examine 
quantitatively any incompatibilities between mSUGRA parameter inferences coming from 
three main constraints: $(g-2)_\mu$, $BR(b \to s\gamma)$ and $\Omega_{DM} h^2$. Thus we determine to what 
extent the three measurements are compatible with each other in an mSUGRA context. 
We also update the fits to WMAP5 data for the first time and include additional $b$-physics 
constraints. Recent data point to an increased statistical significance in the discrepancy 
between the Standard Model prediction and the experimental value of $(g-2)_\mu$, and this 
leads to an additional statistical pull towards a larger contribution to $(g-2)_\mu$ coming from 
supersymmetry. 

^1 However, this point could be easily fixed by the authors of Ref. [20] by separating the points randomly 
into two equally sized samples and examining the $\chi^2$ difference of the minimum point in each. 

Our purpose in this paper is two-fold: as well as producing interesting physical insights, 
we also aim to gain experience in developing and applying tools for efficient Bayesian 
inference, which will prove useful in the analysis of future collider data. 

This paper is organised as follows. In Section 2 we motivate the case for Bayesian model 
selection. In Section 3 we outline our theoretical setup and present our results in Section 4. 
Finally, in Section 5 we list the summary and present our conclusions. We motivate the 
case for the use of Bayesian evidence in quantifying consistency between different data-sets 
in Appendix A. 

2. Bayesian Inference 

A common problem in data analysis is to use the data to make inferences about parameters 
of a given model. A higher level of inference is to decide between two or more competing 
models. For instance, in the case of mSUGRA, one would like to know whether there is 
sufficient evidence in the data to rule out the $\mu < 0$ branch. Bayesian inference provides a 
consistent approach to model selection as well as to the estimation of a set of parameters $\Theta$ 
in a model (or hypothesis) $H$ for the data $D$. It can also be shown that Bayesian inference 
is the unique consistent generalisation of the Boolean algebra [23]. 
Bayes' theorem states that 

$$\Pr(\Theta|D,H) = \frac{\Pr(D|\Theta,H)\,\Pr(\Theta|H)}{\Pr(D|H)}, \qquad (2.1)$$

where $\Pr(\Theta|D,H) \equiv P(\Theta)$ is the posterior probability distribution of the parameters, 
$\Pr(D|\Theta,H) \equiv \mathcal{L}(\Theta)$ is the likelihood, $\Pr(\Theta|H) \equiv \pi(\Theta)$ is the prior distribution, and 
$\Pr(D|H) \equiv Z$ is the Bayesian evidence. 

Bayesian evidence is simply the factor required to normalise the posterior over $\Theta$ and 
is given by: 

$$Z = \int \mathcal{L}(\Theta)\,\pi(\Theta)\,d^N\Theta, \qquad (2.2)$$

where $N$ is the dimensionality of the parameter space. Since the Bayesian evidence does not 
depend on the parameter values $\Theta$, it is usually ignored in parameter estimation problems 
and the posterior inferences are obtained by exploring the un-normalized posterior using 
standard MCMC sampling methods. 

A useful feature of Bayesian parameter estimation is that one can easily obtain the 
posterior distribution of any function, $f$, of the model parameters $\Theta$, since 

$$\Pr(f|D) = \int \Pr(f,\Theta|D)\,d\Theta = \int \Pr(f|\Theta,D)\,\Pr(\Theta|D)\,d\Theta = \int \delta(f(\Theta) - f)\,\Pr(\Theta|D)\,d\Theta, \qquad (2.3)$$

where $\delta(x)$ is the delta function. Thus one simply needs to compute $f(\Theta)$ for every Monte 
Carlo sample and the resulting sample will be drawn from $\Pr(f|D)$. We make use of this 
feature in Section 4.2 where we present the posterior probability distributions of various 
observables used in the analysis of the mSUGRA model. 
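
As a minimal sketch of this procedure, the Python fragment below maps a set of posterior samples through a function and summarises the result. The samples are a toy Gaussian stand-in (not mSUGRA output), equal sample weights are assumed, and the names `derived_samples` and `product` are purely illustrative.

```python
import random
import statistics


def derived_samples(samples, f):
    """Evaluate f on every posterior sample; the outputs are samples of Pr(f|D)."""
    return [f(theta) for theta in samples]


# Toy stand-in for equally weighted posterior samples of two parameters.
rng = random.Random(0)
posterior = [(rng.gauss(1.0, 0.1), rng.gauss(2.0, 0.2)) for _ in range(20000)]

# Hypothetical derived quantity: the product of the two parameters.
product = derived_samples(posterior, lambda theta: theta[0] * theta[1])
mean_f = statistics.fmean(product)
std_f = statistics.pstdev(product)
```

If the samples carry unequal weights, as they can in nested sampling output, the same mapping applies but the summary statistics have to be weighted accordingly.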

In order to select between two models $H_0$ and $H_1$ one needs to compare their respective 
posterior probabilities given the observed data set $D$, as follows: 

$$\frac{\Pr(H_1|D)}{\Pr(H_0|D)} = \frac{\Pr(D|H_1)\,\Pr(H_1)}{\Pr(D|H_0)\,\Pr(H_0)} = \frac{Z_1}{Z_0}\,\frac{\Pr(H_1)}{\Pr(H_0)}, \qquad (2.4)$$

where $\Pr(H_1)/\Pr(H_0)$ is the prior probability ratio for the two models, which can often 
be set to unity but occasionally requires further consideration. It can be seen from Eq. (2.4) 
that the Bayesian evidence takes the center stage in Bayesian model selection. As the 
average of the likelihood over the prior, the Bayesian evidence is higher for a model if more 
of its parameter space is likely, and smaller for a model with a highly peaked likelihood but 
with many regions in the parameter space that have low likelihood values. Hence, Bayesian 
model selection automatically implements Occam's razor: a simpler theory which agrees 
well enough with the empirical evidence is preferred. A more complicated theory will only 
have a higher evidence if it is significantly better at explaining the data than a simpler 
theory. 
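
The Occam effect can be seen in a one-dimensional toy problem, sketched below in Python. The Gaussian likelihood, the flat prior ranges and the function name `evidence_uniform_prior` are illustrative assumptions and are unrelated to the mSUGRA fits; the point is only that widening the prior by a factor of ten, with the same best fit, lowers the evidence by about the same factor.

```python
import math


def evidence_uniform_prior(loglike, lo, hi, n=20000):
    """Midpoint-rule estimate of Z = int L(theta) pi(theta) dtheta for a flat prior on [lo, hi]."""
    width = hi - lo
    step = width / n
    total = sum(math.exp(loglike(lo + (i + 0.5) * step)) for i in range(n))
    return total * step / width


def loglike(theta, sigma=1.0):
    """Toy Gaussian log-likelihood with unit peak value at theta = 0."""
    return -0.5 * (theta / sigma) ** 2


z_narrow = evidence_uniform_prior(loglike, -5.0, 5.0)   # prior width 10
z_wide = evidence_uniform_prior(loglike, -50.0, 50.0)   # prior width 100
occam_factor = z_narrow / z_wide
```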

Unfortunately, evaluation of the multidimensional integral (2.2) is a challenging numerical 
task. Standard techniques like thermodynamic integration [24] are extremely computationally 
expensive which makes evidence evaluation typically at least an order of magnitude 
more costly than parameter estimation. Some fast approximate methods have been used 
for evidence evaluation, such as treating the posterior as a multivariate Gaussian centred 
at its peak (see e.g. Ref. [25]), but this approximation is clearly a poor one for multi-modal 
posteriors (except perhaps if one performs a separate Gaussian approximation at 
each mode). The Savage-Dickey density ratio has also been proposed [26] as an exact, and 
potentially faster, means of evaluating evidences, but is restricted to the special case of 
nested hypotheses and a separable prior on the model parameters. Bridge sampling [21] 
allows the evaluation of the ratio of Bayesian evidence of two models and is implemented 
in the 'bank sampling' method of Ref. [27] but it is not yet clear how accurately bank 
sampling can calculate these evidence ratios. Various alternative information criteria for 
model selection are discussed by [28], but the evidence remains the preferred method. 

The nested sampling approach, introduced by Skilling [29], is a Monte Carlo method 
targeted at the efficient calculation of the evidence, but also produces posterior inferences 
as a by-product. Feroz & Hobson [18, 19] built on this nested sampling framework and 
have recently introduced the MultiNest algorithm which is efficient in sampling from 
multi-modal posteriors exhibiting curving degeneracies, producing posterior samples and 
calculating the evidence value and its uncertainty. This technique has greatly reduced the 
computational cost of model selection and the exploration of highly degenerate multi-modal 
posterior distributions. We employ this technique in this paper. 
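
To illustrate the basic idea of nested sampling, the following Python sketch estimates the log-evidence of a one-dimensional toy problem. It is plain nested sampling with brute-force rejection sampling from the prior, not the MultiNest algorithm (which replaces the rejection step with ellipsoidal sampling), and the function name, the toy likelihood and the settings are assumptions chosen for illustration only.

```python
import math
import random


def nested_sampling_log_evidence(loglike, prior_draw, n_live=200, n_iter=1200, seed=1):
    """Basic nested sampling estimate of log Z using the deterministic
    prior-volume approximation X_i = exp(-i / n_live)."""
    rng = random.Random(seed)
    live = [prior_draw(rng) for _ in range(n_live)]
    live_logl = [loglike(theta) for theta in live]
    z = 0.0
    x_prev = 1.0
    for i in range(1, n_iter + 1):
        # The lowest-likelihood live point defines the current likelihood contour.
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_star = live_logl[worst]
        x_i = math.exp(-i / n_live)
        z += math.exp(logl_star) * (x_prev - x_i)
        x_prev = x_i
        # Replace it by a prior draw inside the contour (simple rejection sampling).
        while True:
            candidate = prior_draw(rng)
            candidate_logl = loglike(candidate)
            if candidate_logl > logl_star:
                break
        live[worst] = candidate
        live_logl[worst] = candidate_logl
    # Add the contribution of the remaining live points.
    z += x_prev * sum(math.exp(logl) for logl in live_logl) / n_live
    return math.log(z)


SIGMA = 0.5


def toy_loglike(theta):
    """Normalised Gaussian likelihood of width SIGMA centred on zero."""
    return -0.5 * (theta / SIGMA) ** 2 - math.log(SIGMA * math.sqrt(2.0 * math.pi))


def toy_prior_draw(rng):
    """Flat prior on [-5, 5]."""
    return rng.uniform(-5.0, 5.0)


log_z = nested_sampling_log_evidence(toy_loglike, toy_prior_draw)
```

For this toy problem the exact answer is very close to log(1/10), since the normalised likelihood is almost entirely contained in a prior of width 10. The rejection step becomes inefficient as the contour shrinks, which is the cost that more refined sampling schemes are designed to avoid.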

The natural logarithm of the ratio of posterior model probabilities provides a useful 
guide to what constitutes a significant difference between two models: 

$$\log \Delta E = \log\left[\frac{\Pr(H_1|D)}{\Pr(H_0|D)}\right] = \log\left[\frac{Z_1}{Z_0}\,\frac{\Pr(H_1)}{\Pr(H_0)}\right]. \qquad (2.5)$$

We summarise the convention we use in this paper in Table 1. 



$|\log \Delta E|$    Odds           Probability    Remark
< 1.0               < 3 : 1        < 0.750        Inconclusive
1.0                 ~ 3 : 1        0.750          Weak Evidence
2.5                 ~ 12 : 1       0.923          Moderate Evidence
5.0                 ~ 150 : 1      0.993          Strong Evidence

Table 1: The scale we use for the interpretation of model probabilities. Here the log represents 
the natural logarithm. 
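
The relation between the columns of Table 1 can be sketched in a few lines of Python. The odds follow from exponentiating $|\log \Delta E|$ and the probability from odds/(1 + odds); the table quotes rounded odds. Treating the tabulated values as lower edges of bins in `interpret` is an assumption made here for illustration, as are the function names.

```python
import math


def log_evidence_to_odds(log_delta_e):
    """Odds implied by a natural-log posterior ratio."""
    return math.exp(abs(log_delta_e))


def odds_to_probability(odds):
    """Probability of the favoured model when the odds are odds : 1."""
    return odds / (1.0 + odds)


def interpret(log_delta_e):
    """Label following Table 1, reading the tabulated values as bin edges (an assumption)."""
    magnitude = abs(log_delta_e)
    if magnitude < 1.0:
        return "Inconclusive"
    if magnitude < 2.5:
        return "Weak Evidence"
    if magnitude < 5.0:
        return "Moderate Evidence"
    return "Strong Evidence"


table_probabilities = [round(odds_to_probability(o), 3) for o in (3, 12, 150)]
```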



While for parameter estimation, the priors become irrelevant once the data are powerful 
enough, for model selection the dependence on priors always remains (although with 
more informative data the degree of dependence on the priors is expected to decrease, see 
e.g. Ref. [30]); indeed this explicit dependence on priors is one of the most attractive features 
of Bayesian model selection. Priors should ideally represent one's state of knowledge 
before obtaining the data. Rather than seeking a unique 'right' prior, one should check 
the robustness of conclusions under reasonable variation of the priors. Such a sensitivity 
analysis is required to ensure that the resulting model comparison is not overly dependent 
on a particular choice of prior and the associated metric in parameter space, which controls 
the value of the integral involved in the computation of the Bayesian evidence (for some 
relevant cautionary notes on the subject see Ref. [31]). 

One of the most important applications of model selection is to decide whether the 
introduction of new parameters is necessary. Frequentist approaches revolve around the 
significance test and goodness-of-fit statistics, where one accepts the additional parameter 
based on the improvement in $\Delta\chi^2$ by some chosen threshold. It has been shown that such 
tests can be misleading (see e.g. Ref. [26]), not least because they depend only on 
the values of $\chi^2$ at the best-fit point, rather than over the entire allowed range of the 
parameters. 

Another application of Bayesian model selection is in quantifying the consistency between 
two or more data sets or constraints [25, 32]. Different experimental observables 
may "pull" the model parameters in different directions and consequently favour different 
regions of the parameter space. Any obvious conflicts between the observables are likely 
to be noticed by the "chi by eye" method employed to date but it is imperative for forthcoming 
high-quality constraints to have a method that can quantify these discrepancies. 
The simplest scenario for analysing different constraints on a particular model is to assume 
that all constraints provide information on the same set of parameter values. We 
represent this hypothesis by $H_1$. This is the assumption which underlies the joint analysis 
of the constraints. However, if we are interested in accuracy as well as precision then 
any systematic differences between constraints should also be taken into account. In the 
most extreme case, which we represent by $H_0$, the constraints would be in conflict to such 
an extent that each constraint requires its own set of parameter values, since they are in 
different regions of parameter space. Bayesian evidence provides a very easy method of 
distinguishing between the scenarios $H_0$ and $H_1$. To see this, we again make use of Eq. (2.4). 
If we have no reason to favour either of $H_0$ or $H_1$ over the other, then we can distinguish 
between these two scenarios using the following ratio, 

$$R = \frac{\Pr(D|H_1)}{\Pr(D|H_0)} = \frac{\Pr(D|H_1)}{\prod_i \Pr(D_i|H_0)}.$$

Here the numerator represents the joint analysis of all the constraints $D = \{D_1, D_2, \ldots, D_n\}$ 
while in the denominator the individual constraints $D_1, D_2, \ldots, D_n$ are assumed to be 
independent and are each fit individually to mSUGRA, with a different set of mSUGRA 
parameters for each $D_i$. The interpretation of the $\log R$ value can be made in a similar 
manner to model selection, as discussed in the preceding paragraph. A positive value of 
$\log R$ gives the evidence in favour of the hypothesis $H_1$ that all the constraints are consistent 
with each other while a negative value would point towards tension between constraints, 
which prefer different regions of mSUGRA parameter space. We follow this recipe to carry 
out consistency checks for the mSUGRA model between $(g-2)_\mu$, $BR(b \to s\gamma)$ and $\Omega_{DM} h^2$ 
as determined by WMAP and other cosmological measurements. The $H_1$ hypothesis thus 
states that mSUGRA jointly fits these three observables, whereas $H_0$ states that they all 
prefer different regions of parameter space and so we require an '(mSUGRA)$^3$' model to 
fit them. Given the fact that Bayesian evidence naturally embodies a quantification of 
Occam's razor, the resulting complexity in the model coming from the additional 2 sets of 
mSUGRA parameters must be matched by a better fit to data for $H_0$ to be preferred. 
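
In terms of log-evidences, the consistency ratio is the joint log-evidence minus the sum of the log-evidences of the individual fits. A minimal Python sketch is given below; the function name and the numerical log-evidence values are hypothetical placeholders, not results of this analysis.

```python
def log_consistency_ratio(log_z_joint, log_z_individual):
    """log R = log Pr(D|H1) - sum_i log Pr(D_i|H0)."""
    return log_z_joint - sum(log_z_individual)


# Hypothetical log-evidences for a joint fit and three separate fits.
log_r = log_consistency_ratio(-10.0, [-3.0, -4.0, -5.0])
constraints_consistent = log_r > 0.0
```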

3. The Analysis 

Our parameter space contains 8 parameters, 4 of them being the mSUGRA parameters 
$m_0$, $M_{1/2}$, $A_0$, $\tan\beta$, and the rest taken from the Standard Model (SM): the QED coupling 
constant in the $\overline{MS}$ scheme $\alpha^{\overline{MS}}(M_Z)$, the strong coupling constant $\alpha_s^{\overline{MS}}(M_Z)$, the running 
mass of the bottom quark $m_b(m_b)^{\overline{MS}}$ and the pole top mass $m_t$. We refer to these SM 
parameters as nuisance parameters. Experimental errors on the mass $M_Z$ of the $Z^0$ boson 
and the muon decay constant $G_\mu$ are so small that we fix these parameters to their central 
values of 91.1876 GeV and $1.16637 \times 10^{-5}$ GeV$^{-2}$ respectively. 

For all the models analysed in this paper, we used 4,000 live points (see Refs. [18, 19]) 
with the MultiNest technique. This corresponds to around 400,000 likelihood evaluations 
taking approximately 20 hours on four 3.0 GHz Intel Woodcrest processors. 



3.1 The Choice of Prior Probability Distribution 

In all cases, we assume the prior is separable, such that 

$$\pi(\Theta) = \pi(\theta_1)\,\pi(\theta_2) \cdots \pi(\theta_8), \qquad (3.1)$$



where $\pi(\theta_i)$ represents the prior probability of parameter $\theta_i$. We consider two initial ranges 
for the mSUGRA parameters which are listed in Table 2. The "2 TeV" range is motivated 
by a general "naturalness" argument that SUSY mass parameters should lie within $O(1$ 
TeV$)$, since otherwise a fine-tuning in the electroweak symmetry breaking sector results. 
Deciding which region of parameter space is natural is obviously subjective. For this reason, 
we include the "4 TeV" range results to check the dependence on prior ranges. We consider 
the branches $\mu < 0$ and $\mu > 0$ separately. 

mSUGRA parameters    2 TeV range         4 TeV range
$m_0$                60 GeV to 2 TeV     60 GeV to 4 TeV
$M_{1/2}$            60 GeV to 2 TeV     60 GeV to 4 TeV
$A_0$                -4 TeV to 4 TeV     -7 TeV to 7 TeV
$\tan\beta$          2 to 62             2 to 62

Table 2: mSUGRA uniform prior parameter ranges 



SM parameters                      Mean value     Uncertainty $\sigma$ (exp)
$1/\alpha^{\overline{MS}}$         127.918        0.018
$\alpha_s^{\overline{MS}}(M_Z)$    0.1176         0.002
$m_b(m_b)^{\overline{MS}}$         4.20 GeV       0.07 GeV
$m_t$                              170.9 GeV      1.8 GeV

Table 3: Constraints on the Standard Model (nuisance) parameters 

^2 We note that the experimental constraint on $m_t$ is changing quite rapidly as new results are issued from 
the Tevatron experiments. The latest combined constraint (released after this paper was first written) is 
$m_t = 172.4 \pm 1.2$ GeV [35]. Any fit differences caused by the movement of the central value will be smeared 
out by its uncertainty, but we shall mention at the relevant point below where the new value could change 
the fits. 

We impose flat priors on all 4 mSUGRA parameters (i.e. $m_0$, $M_{1/2}$, $A_0$ and $\tan\beta$) for 
the "2 TeV" and "4 TeV" ranges and both signs of $\mu$. Current constraints on SM (nuisance) 
parameters are listed in Table 3. With the means and $1\sigma$ uncertainties from Table 3, we 
impose Gaussian priors on SM (nuisance) parameters truncated at $4\sigma$ from their central 
values. We also perform the analysis for flat priors in $\log m_0$ and $\log M_{1/2}$ for both ranges 
and both signs of $\mu$. Since 

$$\int d\log m_0 \; d\log M_{1/2} \; p(m_0, M_{1/2}|D) = \int dm_0 \; dM_{1/2} \; \frac{p(m_0, M_{1/2}|D)}{m_0 M_{1/2}}, \qquad (3.2)$$

it is clear that the logarithmic prior measure has a factor $1/(m_0 M_{1/2})$ compared to 
the linear prior measure and so it could potentially favour lighter sparticles. If the data 
constrain the model strongly enough, lighter sparticles would only be favoured negligibly. 
Our main motive in studying how the fit varies with the prior measure is to 
check the dependence of our results on the choice of the prior. For robust fits, which occur 
when there is enough precisely constraining data, the posterior probability density should 
only have a small dependence upon the precise form of the prior measure. 
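
The effect of the $1/m$ factor can be made concrete by comparing how much prior mass each measure places below a given scale. The Python sketch below does this for a single mass parameter over the "4 TeV" range of Table 2; the 1 TeV reference scale and the function names are illustrative assumptions.

```python
import math


def flat_prior_mass_below(m, lo, hi):
    """Fraction of a flat prior on [lo, hi] lying below m."""
    m = min(max(m, lo), hi)
    return (m - lo) / (hi - lo)


def log_prior_mass_below(m, lo, hi):
    """Fraction of a prior flat in log(m) on [lo, hi] lying below m."""
    m = min(max(m, lo), hi)
    return math.log(m / lo) / math.log(hi / lo)


def log_prior_density(m, lo, hi):
    """Density in m of a prior flat in log(m): proportional to 1/m."""
    return 1.0 / (m * math.log(hi / lo))


LO, HI = 60.0, 4000.0  # GeV, the "4 TeV" range
flat_fraction = flat_prior_mass_below(1000.0, LO, HI)
log_fraction = log_prior_mass_below(1000.0, LO, HI)
```

With these ranges the logarithmic measure assigns roughly two thirds of its prior mass below 1 TeV, against roughly a quarter for the flat measure.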

3.2 The Likelihood 



Observable                          Mean value     Uncertainty
$\delta a_\mu \times 10^{10}$       29.5           8.8
$M_W$                               80.398 GeV     27 MeV
$\sin^2\theta_w^l$                  0.23149        0.000173
$BR(b \to s\gamma) \times 10^4$     3.55           0.72
$\Delta_{0-}$                       0.0375         0.0289
$R_{BR(B_u \to \tau\nu)}$           1.259          0.378
$R_{\Delta M_s}$                    0.85           0.12

Table 4: Summary of the Gaussian distributed observables used in the analysis. For each quantity 
we use a likelihood function with central mean $\mu$ and standard deviation $s = \sqrt{\sigma^2 + \tau^2}$ where $\sigma$ 
is the experimental uncertainty and $\tau$ is the theoretical uncertainty. $\Delta_{0-}$ represents the isospin 
asymmetry of $B \to K^*\gamma$. $R_{BR(B_u \to \tau\nu)}$ represents the ratio of the experimental and SM predictions 
of the branching ratio of $B_u$ mesons decaying into a tau and a tau neutrino. $R_{\Delta M_s}$ is the ratio of 
the experimental and the SM neutral $B_s$ meson mixing amplitudes. The non-Gaussian likelihoods 
for the LEP constraint on Higgs mass, $BR(B_s \to \mu^+\mu^-)$ and $\Omega_{DM} h^2$ are described later. 



Our calculation of the likelihood closely follows Ref. [15], with updated data and additional 
variables included, and is summarised in Table 4 and discussed further below. We 
assume that the measurements $D_i$ of observables (the 'data') used in our likelihood calculation 
are independent and have Gaussian errors^3, so that the likelihood distribution for a 
given model ($H$) is 

$$\mathcal{L}(\Theta) \equiv \Pr(D|\Theta,H) = \prod_i \Pr(D_i|\Theta,H), \qquad (3.3)$$

where 

$$\Pr(D_i|\Theta,H) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left[-\chi_i^2/2\right] \qquad (3.4)$$

and 

$$\chi_i^2 = \frac{(c_i - p_i)^2}{\sigma_i^2}. \qquad (3.5)$$

Here $c_i$ is the measured central value of observable $i$, $p_i$ is the "predicted" value of the 
observable $i$ given the knowledge of the model $H$, and $\sigma_i$ is the standard error of the measurement. 

^3 The experimental constraints from the LEP constraint on Higgs mass, $BR(B_s \to \mu^+\mu^-)$ 
and the $\Omega_{DM} h^2$ likelihood, each described later, are not Gaussian. 
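
A minimal Python sketch of the Gaussian part of the likelihood, Eqs. (3.3) to (3.5), is given below, working with log-likelihoods so that the product over observables becomes a sum. The two constraints are taken from Table 4; the function names and the model predictions are hypothetical placeholders rather than output of any spectrum code.

```python
import math


def combined_uncertainty(sigma_exp, tau_theory):
    """Experimental and theoretical uncertainties added in quadrature."""
    return math.hypot(sigma_exp, tau_theory)


def gaussian_loglike(prediction, mean, s):
    """Log of a Gaussian likelihood term with chi^2 = ((mean - prediction) / s)^2."""
    chi2 = ((mean - prediction) / s) ** 2
    return -0.5 * chi2 - 0.5 * math.log(2.0 * math.pi * s * s)


def total_loglike(predictions, constraints):
    """Sum of independent Gaussian log-likelihood terms."""
    return sum(
        gaussian_loglike(predictions[name], mean, s)
        for name, (mean, s) in constraints.items()
    )


# (mean, combined uncertainty) for two of the observables in Table 4.
CONSTRAINTS = {"M_W": (80.398, 0.027), "sin2_theta_eff": (0.23149, 0.000173)}

# Hypothetical predictions for one parameter point: M_W is one sigma low.
example_predictions = {"M_W": 80.371, "sin2_theta_eff": 0.23149}
example_loglike = total_loglike(example_predictions, CONSTRAINTS)
best_loglike = total_loglike({"M_W": 80.398, "sin2_theta_eff": 0.23149}, CONSTRAINTS)
```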



In order to calculate predictions $p_i$ for observables from the input parameters $\Theta$, 
SOFTSUSY2.0.17 [?] is first employed to calculate the MSSM spectrum. Bounds upon 
the sparticle spectrum have been updated and are based upon the bounds collected in 
Ref. [11]. Any spectrum violating a 95% limit from negative sparticle searches is assigned 
a zero likelihood density. Also, we set a zero likelihood for any inconsistent point, e.g. one 
which does not break electroweak symmetry correctly, has a charged LSP, or has tachyonic 
sparticles. For points that are not ruled out, we then link the mSUGRA spectrum via the 
SUSY Les Houches Accord [?] (SLHA) to various other computer codes that calculate 
various observables. For instance, micrOMEGAs1.3.6 [?] calculates $\Omega_{DM} h^2$, the branching 
ratio $BR(B_s \to \mu^+\mu^-)$ and the anomalous magnetic moment of the muon $(g-2)_\mu$. 

The anomalous magnetic moment of the muon $a_\mu \equiv (g-2)_\mu/2$ was measured to be 
$a_\mu^{exp} = (11659208.0 \pm 6.3) \times 10^{-10}$ [?]. Its experimental value is in conflict with the SM 
predicted value $a_\mu^{SM} = (11659178.5 \pm 6.1) \times 10^{-10}$ from [?], which includes the latest 
QED [48], electroweak [49], and hadronic [?] contributions to $a_\mu$. This SM prediction 
does not however account for $\tau$ data which is known to lead to significantly different 
results for $a_\mu$, implying underlying theoretical difficulties which have not been resolved so 
far. Restricting to $e^+e^-$ data, hence using the numbers given above, we find 

$$\frac{\delta(g-2)_\mu}{2} \equiv \delta a_\mu \equiv a_\mu^{exp} - a_\mu^{SM} = (29.5 \pm 8.8) \times 10^{-10}. \qquad (3.6)$$

This excess may be explained by a supersymmetric contribution, the sign of which is 
identical in mSUGRA to the sign of the superpotential $\mu$ parameter [50]. After obtaining 
the one-loop MSSM value of $(g-2)_\mu$ from micrOMEGAs v1.3.6, we add the dominant 
2-loop corrections detailed in Refs. [51, 52]. 
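
The arithmetic behind Eq. (3.6) is a difference of central values with the two uncertainties added in quadrature, which can be checked with the short Python sketch below. The variable and function names are illustrative; the numbers are those quoted in the text, in units of $10^{-10}$.

```python
import math

# (central value, uncertainty) in units of 1e-10, as quoted in the text.
A_MU_EXP = (11659208.0, 6.3)
A_MU_SM = (11659178.5, 6.1)


def difference_with_error(a, b):
    """Difference of two independent measurements, errors added in quadrature."""
    return a[0] - b[0], math.hypot(a[1], b[1])


delta_a_mu, delta_a_mu_err = difference_with_error(A_MU_EXP, A_MU_SM)
pull = delta_a_mu / delta_a_mu_err  # discrepancy in units of its uncertainty
```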

The $W$ boson pole mass $M_W$ and the effective leptonic mixing angle $\sin^2\theta_w^l$ are also 
used in the likelihood. We take the measurements to be [?] 

$$M_W = 80.398 \pm 0.027 \text{ GeV}, \qquad \sin^2\theta_w^l = 0.23149 \pm 0.000173, \qquad (3.7)$$

where experimental errors and theoretical uncertainties due to missing higher-order corrections 
in SM [53] and MSSM [?, 54] have been added in quadrature. The most up to date 
MSSM predictions for $M_W$ and $\sin^2\theta_w^l$ [38] are finally used to compute the corresponding 
likelihoods. 

A parameterisation of the LEP2 Higgs search likelihood for various Standard Model 
Higgs masses is utilised, since the lightest Higgs $h$ of mSUGRA is very SM-like once the 
direct search constraints are taken into account. It is smeared with a 2 GeV assumed 
theoretical uncertainty in the SOFTSUSY2.0.17 prediction of $m_h$ as described in [?]. 

The experimental value of the rare bottom quark branching ratio to a strange quark 
and a photon, $BR(b \to s\gamma)$, is constrained to be [55] 

$$BR(b \to s\gamma) = (3.55 \pm 0.26) \times 10^{-4}. \qquad (3.8)$$

The SM prediction has recently moved down quite substantially from $(3.60 \pm 0.30) \times 10^{-4}$ 
to $(3.15 \pm 0.23) \times 10^{-4}$ [?]. This shift was caused by including most of the 
next-to-next-to-leading order (NNLO) perturbative QCD contributions as well as the leading 
non-perturbative and electroweak effects. We use the publicly available code SuperIso2.0 [40] 
(linked via the SLHA to the mSUGRA spectrum predicted) which computes $BR(b \to s\gamma)$ 
in the MSSM with Minimal Flavor Violation. We note that mSUGRA is of such a minimal 
flavor violating form, and so the assumptions present in SuperIso2.0 are the appropriate 
ones. The computation takes into account one-loop SUSY contributions, as well as 
$\tan\beta$-enhanced two-loop contributions in the effective lagrangian approach. The recent partial 
NNLO SM QCD corrections are also included by the program. Ref. [?] derives a 95% 
interval for the bounds including the experimental and theory SM/MSSM errors to be 

$$2.07 \times 10^{-4} < BR(b \to s\gamma) < 4.84 \times 10^{-4}. \qquad (3.9)$$

For the constraint on $BR(b \to s\gamma)$, we use the mean value of $3.55 \times 10^{-4}$ and derive the 
1-$\sigma$ uncertainty from the above given bound to be equal to $0.72 \times 10^{-4}$. We note that 
this is twice as large as the uncertainty used in another recent global fit [14], where an 
enhancement in the posterior density of the large $\tan\beta$ region was observed to result from 
the new constraint. 

The new upper 95% C.L. bound on $BR(B_s \to \mu^+\mu^-)$ coming from the CDF collaboration 
is $5.8 \times 10^{-8}$. We are in possession [?] of the empirical $\chi^2$ penalty for this observable 
as a function of the predicted value of $BR(B_s \to \mu^+\mu^-)$ from old CDF data when the 95% 
C.L. upper bound was $0.98 \times 10^{-7}$. Here, we assume that the shape of the likelihood penalty 
coming from data is the same as presented in Ref. [?], but that only the normalisation of 
the branching ratio shifts by the ratio of the 95% C.L. upper bounds: 0.58/0.98. 

For the $\Delta_{0-}$ isospin asymmetry of $B \to K^*\gamma$, the 95% confidence level for the experimental 
results from the combined BABAR and Belle data combined with the theoretical 
errors is [41]: 

-0.018 X 10"^ < Ao_ < 0.093 x 10~^ (3.10) 

with the central value of 0.0375. We derive the 1-$\sigma$ uncertainty from the above given 
bound to be equal to 0.0289. We use the publicly available code SuperIso2.0 [40] to 
calculate $\Delta_{0-}$. We neglect experimental correlations between the measurements of $\Delta_{0-}$ 
and $BR(b \to s\gamma)$. In practice, the $\Delta_{0-}$ constraint makes a much smaller difference than 
$BR(b \to s\gamma)$ to our fits, and so we expect the inclusion of a correlation to also have a 
small effect. The parametric correlations caused by variations of $\alpha_s(M_Z)$ and $m_b(m_b)$ are 
included by our analysis, since they are varied as input parameters. 

The average experimental value of $BR(B_u \to \tau\nu)$ from HFAG [?] (under purely 
leptonic modes) is: 

$$BR^{exp}(B_u \to \tau\nu) = (141 \pm 43) \times 10^{-6}. \qquad (3.11)$$

The SM prediction is rather uncertain because of two incompatible empirically derived 
values of $|V_{ub}|$: one being $(3.68 \pm 0.14) \times 10^{-3}$. The other comes from inclusive semi-leptonic 
decays and is $(4.49 \pm 0.33) \times 10^{-3}$. These lead to $BR(B_u \to \tau\nu)$ values of $(0.85 \pm 0.13) \times 10^{-4}$ 
and $(1.39 \pm 0.44) \times 10^{-4}$ respectively. We statistically average these two by averaging the 
central values, and then adding the errors in quadrature and dividing by $\sqrt{2}$. This gives: 

$$BR^{SM}(B_u \to \tau\nu) = (112 \pm 25) \times 10^{-6}. \qquad (3.12)$$
Taking the ratio of the experimental and SM values of $BR(B_u \to \tau\nu)$ gives: 

$$R_{BR(B_u \to \tau\nu)} = 1.259 \pm 0.378. \qquad (3.13)$$

For the MSSM prediction, we use the formulae in Ref. [59], which include the large $\tan\beta$ 
limit of one-loop corrections coming from loops involving a charged Higgs. 

The experimental and SM-predicted values of the neutral $B_s$ meson mixing amplitude 
are [?]: 

$$\Delta M_s^{exp} = 17.77 \pm 0.12 \text{ ps}^{-1}, \qquad \Delta M_s^{SM} = 20.9 \pm 2.6 \text{ ps}^{-1}. \qquad (3.14)$$

Taking the ratio of these two values, we get: 

$$R_{\Delta M_s} = 0.85 \pm 0.12. \qquad (3.15)$$

We use the formulae of Ref. [60] for the MSSM prediction of $R_{\Delta M_s}$, calculating it in 
the large $\tan\beta$ approximation. The dominant correction comes from one-loop diagrams 
involving a neutral Higgs boson. 

The WMAP 5-year data combined with the distance measurements from the Type Ia 
supernovae (SN) and the Baryon Acoustic Oscillations (BAO) in the distribution of galaxies 
gives the $\Lambda$-cold dark matter fitted value of the dark matter relic density [61] 

$$\Omega_{DM} h^2 = 0.1143 \pm 0.0034. \qquad (3.16)$$

Figure 1: Depiction of our likelihood constraint on the predicted value of $\Omega_{DM} h^2$ due to 
lightest neutralinos, compared to a simple Gaussian with WMAP5 central value and a $1\sigma$ 
uncertainty of 0.02. 

In the present paper, we assume that the dark matter consists of the neutralino, the LSP. 
Recently, it has been shown that the LSP relic density is highly sensitive to the pre-Big Bang 
Nucleosynthesis (BBN) rate and even a modest modification can greatly enhance the calculated relic 
density with no contradiction with the cosmological observations [62]. It is also possible 
that a non-neutralino component of dark matter is concurrently present and indeed the 
inclusion of neutrino masses via right-handed neutrinos can change the relic density prediction 
somewhat [63]. We therefore penalise only for the predicted $\Omega_{DM} h^2$ being greater 
than the WMAP5 + BAO + SN central value. We define $x$ to be the predicted value of 
$\Omega_{DM} h^2$, $c = 0.1143$ to be the central $\Omega_{DM} h^2$ value from WMAP5 + BAO + SN observations 
and $s$ to be the error on the predicted $\Omega_{DM} h^2$ value which includes theoretical as 
well as experimental components. We take $s = 0.02$ in order to incorporate an estimate of 
higher order uncertainties in its prediction [63] and we define the likelihood as: 



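$$\mathcal{L}(x \equiv \Omega_{DM} h^2) = \begin{cases} \dfrac{1}{c + \sqrt{\pi s^2/2}} & \text{if } x < c, \\[2ex] \dfrac{1}{c + \sqrt{\pi s^2/2}} \exp\left[-\dfrac{(x-c)^2}{2 s^2}\right] & \text{if } x > c. \end{cases} \qquad (3.17)$$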
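As an illustration of Eq. (3.17), the following is a minimal Python sketch of the one-sided relic density likelihood. The function and constant names are hypothetical and are not taken from the analysis code used for the fits; the values of c and s are those quoted in the text. The prefactor 1/(c + sqrt(pi s^2/2)) normalises the likelihood to unit area over x >= 0.

```python
import math

C_WMAP = 0.1143  # central value c of Omega_DM h^2 quoted in Eq. (3.16)
S_ERR = 0.02     # combined uncertainty s adopted in the text


def relic_density_likelihood(x, c=C_WMAP, s=S_ERR):
    """One-sided likelihood of Eq. (3.17).

    Flat for predicted relic densities below c (no penalty) and a
    Gaussian tail of width s above c.
    """
    if x < 0:
        raise ValueError("relic density must be non-negative")
    norm = 1.0 / (c + math.sqrt(math.pi * s**2 / 2.0))
    if x < c:
        return norm
    return norm * math.exp(-((x - c) ** 2) / (2.0 * s**2))


# Any prediction below c is treated identically; predictions above c
# are penalised with increasing severity.
for x in (0.05, 0.10, 0.1143, 0.15, 0.20):
    print(x, relic_density_likelihood(x))
```

The flat region is what allows a non-neutralino dark matter component: a point that under-produces dark matter is not disfavoured relative to one that saturates the WMAP5 value.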
A diagram of the resulting likelihood penalty is displayed in Fig. 1. This differs slightly
from the formulation suggested previously by one of the authors for $\mathcal{L}(x \equiv \Omega_{DM} h^2)$ in
the case when a non-neutralino component of dark matter is concurrently present, which
drops more quickly than our flat likelihood up until the peak of the WMAP Gaussian
likelihood distribution.



4. Results 

In this section, we first show our main results on the quantification of the preference of the
fits for $\mu > 0$. Next, we show some highlights of updated parameter constraints coming
from the fit, finishing with a study on the level of compatibility of various observables.

4.1 Model Comparison 

We summarise our main results in Table 5, in which we list the posterior model probability
odds, $P_+/P_-$, for mSUGRA models with $\mu > 0$ and $\mu < 0$, for the two prior ranges used
with flat and logarithmic prior measures as discussed in Section 0. The calculation of the
ratio of posterior model probabilities requires the prior probability ratio for the two signs
of $\mu$ (see Section 2), which we have set to unity. One could easily calculate the ratio $P_+/P_-$
for a different prior probability ratio $r$ by multiplying $P_+/P_-$ in Table 5 by $r$. From
the probability odds listed in Table 5, although there is positive evidence in favour of the
mSUGRA model with $\mu > 0$, the extent of the preference depends quite strongly on the
priors used: the evidence ranges from relatively strong for the logarithmic prior with the
"2 TeV" range to weak for flat priors with the "4 TeV" range. This dependence on
the prior is a clear sign that the data are not yet of sufficiently high quality to
distinguish between these models unambiguously. Hopefully, the forthcoming high-quality
data from the LHC will shed more light on this question.
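The conversion between a log-evidence difference and posterior odds described above can be sketched as follows. This is a minimal illustration with hypothetical function and argument names; it assumes natural logarithms and first-order (linear) error propagation, which is an assumption of this sketch rather than a statement of how the quoted uncertainties were obtained. Because the tabulated values are rounded, the sketch will not reproduce the quoted odds and uncertainties exactly.

```python
import math


def posterior_odds(log_delta_e, sigma_log_delta_e=0.0, prior_ratio=1.0):
    """Posterior odds P+/P- from the log-evidence difference.

    log_delta_e       : log E(mu > 0) - log E(mu < 0), natural log
    sigma_log_delta_e : uncertainty on log_delta_e
    prior_ratio       : prior probability ratio r for the two signs of mu
    Returns (odds, uncertainty), with the uncertainty propagated to
    first order: sigma_odds = odds * sigma_log_delta_e.
    """
    odds = prior_ratio * math.exp(log_delta_e)
    return odds, odds * sigma_log_delta_e


# Equal prior probabilities for the two signs of mu (r = 1),
# then the same evidence difference with a hypothetical r = 2.
print(posterior_odds(1.8, 0.1))
print(posterior_odds(1.8, 0.1, prior_ratio=2.0))
```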



Prior                                   "2 TeV"                      "4 TeV"
                                        flat          log            flat          log
$\log \Delta E$ (our determination)     2.7 ± 0.1     4.1 ± 0.1      1.8 ± 0.1     3.2 ± 0.1
$P_+/P_-$ (our determination)           15.6 ± 1.1    61.6 ± 4.3     5.9 ± 0.4     24.0 ± 1.7
$\log \Delta E$ (from Ref. [12])        2.1           -              1.8           2.7
$P_+/P_-$ (from Ref. [12])              8.3           -              6.2           14.3

Table 5: The posterior probability ratios for the mSUGRA model with different signs of $\mu$. Here we
have assumed the prior probabilities of the different signs of $\mu$ to be the same. The uncertainties on
$\log \Delta E$ for the mSUGRA model with different signs of $\mu$ are the same for different priors, since with
the MultiNest technique, the uncertainty on the evidence value is set by the number of live points
and the stopping criteria (see Refs. [Q |l^), which were the same for the different priors used in this
study. The lower two rows show, for comparison, a previous determination with earlier data using the
much less precise bridge sampling method. Some aspects of that fit were somewhat different from the
present work's approach and are discussed in the text.

We also show in Table 5, for comparison, the probability ratio $P_+/P_-$ determined in
an earlier MCMC fit using different data [12]. We can see that our determination of the
probability ratio favours $\mu > 0$ more strongly than Ref. [12]. The main factors affecting
this are that Ref. [12] had an anomalous magnetic moment of the muon less in conflict
with experiment, $\delta a_\mu = (22 \pm 10) \times 10^{-10}$ as opposed to Eq. ^ in the present analysis,
and that the present analysis also includes the additional $b$-observables $\Delta m_s$,
$BR(B_u \to \tau\nu)$ and $\Delta_{0-}$. Some other details of the fit were also different in Ref. [12]:
for instance $m_{1/2} < 2$ TeV for all fits, and the range of $A_0$ was different. These ranges
will affect the evidence obtained, at least to some degree. Unfortunately, Ref. [12] neglects
to present statistical errors in the determination of the ratios of evidence values, a situation
which is rectified in Table 5. It is clear from Table 5 that the uncertainty in the result of
the model comparison is presently dominated by the prior choice, rather than by the small
statistical uncertainty in the determination of the evidence ratio with MultiNest. It can
however be concluded that present data favour the $\mu > 0$ branch of mSUGRA with a
Bayesian evidence going from weak to moderate, depending on the choice of prior.

To quantify the extent to which these results depend on the $(g-2)_\mu$ constraint, we calculate
the Bayesian evidence ratio for mSUGRA models with $\mu > 0$ and $\mu < 0$, for the flat "4
TeV" range priors with all the observables discussed in Section 3.2 apart from $(g-2)_\mu$. We
find $\log \Delta E = -0.5 \pm 0.1$, translating into posterior probability odds $P_+/P_- = 0.6 \pm 1.1$.
This shows that in the absence of the $(g-2)_\mu$ constraint, both mSUGRA models with $\mu > 0$
and $\mu < 0$ are equally favoured by the data. Inclusion of the $(g-2)_\mu$ constraint causes a shift
of 2.3 log units in favour of $\mu > 0$ for the linear "4 TeV" range prior measure, and hence it
can be concluded that $(g-2)_\mu$ does indeed dominate our model selection results in favour
of $\mu > 0$.



4.2 Updated Parameter Constraints 
Figure 2: The 2-dimensional posterior probability densities in the plane spanned by mSUGRA
parameters: $m_0$, $m_{1/2}$, $A_0$ and $\tan\beta$ for the linear prior measure "4 TeV" range analysis and $\mu > 0$.
The inner and outer contours enclose 68% and 95% of the total probability respectively. All of the
other parameters in each plane have been marginalised over.
We display the results of the MultiNest fits as posterior probability densities on the
$m_0$-$m_{1/2}$ and $m_0$-$\tan\beta$ planes in Fig. 2*. Previous global fits in mSUGRA have
found that the dark matter relic density has the largest effect on parameter space [^. In
particular, regions where the LSP annihilates efficiently through some particular mechanism
are preferred by the fits. In the left-hand panel, we see that the highest posterior region is
where the stau co-annihilation channel is active at the lowest value of $m_0$, where the lightest
stau co-annihilates very efficiently with the lightest neutralino due to their near mass
degeneracy. Next, in the approximate region 0.5 TeV < $m_0$ < 1.5 TeV, there is another
reasonably high posterior region. In this region, $\tan\beta$ is large and the LSP is approximately
half the mass of the pseudo-scalar Higgs boson $A^0$. The process $\chi_1^0 \chi_1^0 \to A^0 \to b\bar{b}$ becomes
an efficient channel in this region. For higher values of $m_0 > 2$ TeV, the hyperbolic
branch [^, ^] regime reigns, where the LSP contains a significant higgsino component and
annihilation into weak gauge boson pairs becomes quite efficient. This region dominantly
has $\tan\beta > 10$, as can be seen in the right-hand panel of Fig. 2. All of the qualitative
features of previous MCMC global fits |9|, ^ 0, |l^, |l5| have been reproduced in the figure,
providing a useful validation of the MultiNest technique in a particle physics context,
where the shape of the multi-dimensional posterior exhibits multi-modality and curving
degeneracies. 2-dimensional marginalisations in other mSUGRA parameter combinations
also agree to a large extent with previous MCMC fits, for both $\mu > 0$ and $\mu < 0$. However,
compared to MCMC fits in Refs. ||9|, [I^, 14], there has been a slight migration for $\mu > 0$: the
stau co-annihilation region has become relatively more favoured than previously and the
hyperbolic branch has become less favoured. This is primarily due to $M_W$ and $\sin^2\theta_w^l$: our
calculation includes 2-loop MSSM effects and so we are able to place smaller errors on the
theoretical prediction than Refs. [^, 11, ^. Both of these variables show a mild preference
for a sizable SUSY contribution once these 2-loop effects are included [67]. The pure
SOFTSUSY2.0.17 calculation is at 1-loop order and, without the additional two-loop effects,
it displays a preference for larger SUSY scalar masses [12], thus favouring the hyperbolic
branch region more. An effect in the opposite direction that comes from including the
NNLO corrections to $BR(b \to s\gamma)$ is possible [14]. Large values of $m_0$ in the hyperbolic
branch region lead to fairly light charged Higgs' in mSUGRA due to charged Higgs-top
loops, which may then push the branching ratio toward its experimentally preferred range,
by adding constructively to the Standard Model contribution. However, our estimate of
the combined statistical error of $BR(b \to s\gamma)$ in Table ^ means that this effect only has a
small statistical pull on the fits, being out-weighed by the effects mentioned above in the
opposite direction. We note here that, as $m_t$ as determined from experiment increases, the
focus point region moves to higher values of $m_0$ [68]. However, very similar fits to the ones
presented here were performed for $m_t = 172.6 \pm 1.8$ GeV, see Fig. 2a of Ref. [16], and the
posterior density on the $m_0$-$m_{1/2}$ plane did not change much compared to the present
paper (which uses $m_t = 170.9 \pm 1.8$ GeV).

For $\mu < 0$, the fit prefers a higher posterior probability for the focus point region
compared to Ref. [12]. We show the marginalisation of $\mu < 0$ mSUGRA to the $m_0$-$m_{1/2}$
plane in Fig. 3. The left-hand panel shows the linear measure prior analysis and may
be compared directly with Fig. 5a of Ref. p^, which has the stau co-annihilation region
having the highest posterior density. The increased discrepancy of $(g-2)_\mu$ in the present
fit with current data will favour heavier sparticles due to the SUSY contribution being of
the wrong sign for $\mu < 0$ mSUGRA. In the right-hand panel, we see how the fit changes due
to a logarithmic measure on the prior. Indeed, the foreseen shift toward lower values of $m_0$
is significant, the stau co-annihilation channel being favoured once more. Although there
are some similarities with the left-hand panel, it is clear that the choice of prior measure
still has a non-negligible effect on the fit despite the inclusion of new $b$-physics observables.

*The uneven "bobbly" appearance of the 2d marginalized posteriors is due to the small pixel size used in
the marginalized grid; this was required in order to resolve the finest features in the posterior distributions.

Figure 3: The 2-dimensional mSUGRA posterior probability densities in the $m_0$-$m_{1/2}$ plane for
$\mu < 0$ for (left) the '4 TeV range' linear measure prior analysis and (right) the '4 TeV range'
logarithmic measure prior analysis. The inner and outer contours enclose 68% and 95% of the total
probability respectively. All of the other parameters in each plane have been marginalised over.

With this fact still in mind, we compare the posterior probability density functions for
$\mu > 0$ and $\mu < 0$ in Fig. 4 for linear measure priors. In Fig. 4, we see the preference for
heavier sparticles in the $\mu < 0$ case reflected in the larger values for the universal scalar and
gaugino masses $m_0$ and $m_{1/2}$. It is clear from the top left-hand panel that any inference
made about scalar masses for $\mu < 0$ will be quite sensitive to the exact range taken, since the
$m_0$ distribution is near its maximum at large values close to 4 TeV. On the other hand, the
data constrain $m_{1/2} < 2$ TeV robustly. $\mu < 0$ favours large $\tan\beta$ less than $\mu > 0$ does, since for
large $\tan\beta$, $(g-2)_\mu$ becomes more negative, with the wrong sign compared to the data.

As discussed in Section 2, one can easily obtain the posterior for the observables,
which are derived from the model parameters, from the posterior of the model parameters.
Fig. 5 displays the statistical pulls of the various observables. In the absence of any tension
between the constraints or volume effects, one would expect the posterior curves to lie on
top of the likelihood curves representing the experimental data used in the analysis (see
also [10]). In order to separate the volume effects from pulls originating from data, the
likelihood profile could be used [15]. Here though, we just comment on the combined effect
Figure 4: Comparison of $\mu < 0$ and $\mu > 0$ 1-dimensional relative posterior probability densities
of mSUGRA parameters for the linear measure prior '4 TeV range' analysis. All of the other input
parameters have been marginalised over.



Parameter                 $\mu > 0$                          $\mu < 0$
                          68% region     95% region      68% region     95% region
$m_{h^0}$ (GeV)           (117, 119)     (114, 121)      (119, 120)     (117, 121)
rrijifi (TeV)             (0.62, 2.12)   (0.48, 3.33)    (1.08, 3.23)   (0.75, 3.75)
m,-^ (TeV)                (1.57, 3.79)   (0.93, 4.47)    (2.71, 4.18)   (2.07, 4.64)
$m_{\tilde g}$ (TeV)      (1.53, 2.17)   (0.95, 3.15)    (1.75, 2.45)   (1.11, 3.29)
$m_{\chi_1^0}$ (TeV)      (0.19, 0.48)   (0.11, 0.68)    (0.20, 0.52)   (0.13, 0.70)
$m_{\chi_1^\pm}$ (TeV)    (0.25, 0.86)   (0.14, 1.22)    (0.22, 0.88)   (0.15, 1.26)
me-^ (TeV)                (0.69, 3.34)   (0.21, 3.91)    (2.09, 3.75)   (0.93, 3.97)


Table 6: Sparticle mass ranges for the linear '4 TeV' analysis corresponding to 68% and 95% of the
posterior probability.

from the two mechanisms. We see that $\Omega_{DM} h^2$ has a preference for being rather small,
but non-zero, for either sign of $\mu$. Since any value below $\Omega_{DM} h^2 = 0.1143$ is not penalised
by the likelihood penalty we have used, this may be ascribed to a combination of volume
effects (where there is simply more volume of parameter space with a small relic density)
and pull toward those regions from the other observables. The biggest disparity between
the experimental data and the posterior probability distribution is observed for the $\delta a_\mu$
constraint, which can only be near its central measured value for light sparticles and large
$\tan\beta$. Many of the other constraints are pulling toward large values of the masses, where
the volume of parameter space is larger, and so small values of $|\delta a_\mu|$ are preferred. We
Figure 5: An illustration of tensions between different observables for the mSUGRA model. The
black (dash-dotted), red (thin solid) and the blue (thick solid) lines show the relative posterior
probability for $\mu > 0$, $\mu < 0$ and the likelihood respectively for each observable.



see a slight preference for $\mu < 0$ from the $BR(b \to s\gamma)$ constraint, as expected from the
discussion in Section 4.2 and Ref. [14], but this is too small to outweigh the effects of $\delta a_\mu$,
as shown previously by our estimate of $P_+/P_-$. The figure shows that the ratio $R_{\Delta m_s}$ of
the MSSM prediction of the $B_s$ mass splitting to the SM prediction is really not active, i.e.
it does not vary across the allowed mSUGRA parameter space, and so does not have an
effect on the posterior density.

We list the sparticle mass ranges for the linear '4 TeV' analysis corresponding to 68% and
95% of the posterior probability in Table 6.

4.3 Consistency Check between Different Constraints 

It is clear from Fig. 5 that $\delta a_\mu$ and $BR(b \to s\gamma)$, both important observables, are pulling
in opposite directions. We choose the 'strongly preferred' value of $\mu > 0$ for our analysis.
In order to check whether the observables $(g-2)_\mu$ and $BR(b \to s\gamma)$ provide consistent
information on the $\mu > 0$ branch of mSUGRA parameter space, calculation of the parameter
$R$ as given in Eq. 2.6 is required. In order to carry out this calculation, we impose linear
'4 TeV' priors. In Fig. 6, we plot the posterior probability distributions for the $m_0$-$m_{1/2}$
and $m_0$-$\tan\beta$ planes for the analysis with $\Omega_{DM} h^2$, $(g-2)_\mu$ and $BR(b \to s\gamma)$
individually. From the figure, we see that the 68% probability regions preferred by the
$\delta a_\mu$ and $BR(b \to s\gamma)$ data are a little different, as expected for $\mu > 0$, since $\delta a_\mu$ prefers
light SUSY particles whereas the $BR(b \to s\gamma)$ datum prefers heavy ones in the hyperbolic
branch region. Nevertheless, there is some overlap in the 95% probability regions favoured by
these two data-sets. One would then expect the inconsistency between $BR(b \to s\gamma)$ and
$(g-2)_\mu$ not to be highly significant. We evaluate

$$\log R = -0.32 \pm 0.04, \qquad (4.1)$$

showing very small evidence for inconsistency between $(g-2)_\mu$ and $BR(b \to s\gamma)$.

Since $\Omega_{DM} h^2$ plays such a dominant role in shaping the posterior, we next check
consistency between all three constraints in mSUGRA. We perform the analysis in the
same manner as described above and evaluate $R$ to be:

$$\log R = 0.61 \pm 0.06, \qquad (4.2)$$

showing no evidence for inconsistency between $(g-2)_\mu$, $BR(b \to s\gamma)$ and $\Omega_{DM} h^2$.

These results can be seen qualitatively in the 2-D posterior for the joint analysis of
$(g-2)_\mu$, $BR(b \to s\gamma)$ and $\Omega_{DM} h^2$ in Fig. 6. It can be seen that the joint posterior lies
precisely in the region of overlap between the posteriors for the analysis of these three data-sets
separately. As shown in Appendix A, in the presence of any inconsistency between different
data-sets, the joint posterior can be seen to exclude the high posterior probability regions
for the analysis with the data-sets separately. This is not the case here, and consequently
we do not find strong evidence for inconsistency between the $(g-2)_\mu$, $BR(b \to s\gamma)$ and
$\Omega_{DM} h^2$ data-sets.

We now treat all the observables $\mathbf{D}$, apart from $(g-2)_\mu$, $BR(b \to s\gamma)$ and $\Omega_{DM} h^2$, as
additional priors on the mSUGRA parameter space in order to see whether these have any
effect on the consistency between $(g-2)_\mu$ and $BR(b \to s\gamma)$. Eq. 2.6 then becomes:

$$R = \frac{\Pr((g-2)_\mu, BR(b \to s\gamma) \,|\, \mathbf{D}, H_1)}{\Pr((g-2)_\mu \,|\, \mathbf{D}, H_0)\, \Pr(BR(b \to s\gamma) \,|\, \mathbf{D}, H_0)}, \qquad (4.3)$$

where the $H_1$ hypothesis states that mSUGRA jointly fits the two observables, whereas $H_0$
states that the two observables prefer different regions of parameter space.

Since the measurements $D_i$ of the observables used in the likelihood are independent,

$$\Pr((g-2)_\mu, BR(b \to s\gamma) \,|\, \mathbf{D}, H_1) = \frac{\Pr((g-2)_\mu, BR(b \to s\gamma), \mathbf{D} \,|\, H_1)}{\Pr(\mathbf{D} \,|\, H_1)}, \qquad (4.4)$$

$$\Pr((g-2)_\mu \,|\, \mathbf{D}, H_0) = \frac{\Pr((g-2)_\mu, \mathbf{D} \,|\, H_0)}{\Pr(\mathbf{D} \,|\, H_0)}, \qquad (4.5)$$

$$\Pr(BR(b \to s\gamma) \,|\, \mathbf{D}, H_0) = \frac{\Pr(BR(b \to s\gamma), \mathbf{D} \,|\, H_0)}{\Pr(\mathbf{D} \,|\, H_0)}, \qquad (4.6)$$

where $\Pr(\mathbf{D}|H_0) = \Pr(\mathbf{D}|H_1)$ is the Bayesian evidence for the analysis of the $\mu > 0$ branch of the
mSUGRA model with $\mathbf{D}$, all the observables apart from $(g-2)_\mu$, $BR(b \to s\gamma)$ and $\Omega_{DM} h^2$.
Figure 6: The 2-dimensional posterior probability distributions of the $\mu > 0$ branch of mSUGRA
with: from top to bottom, $\Omega_{DM} h^2$, $BR(b \to s\gamma)$, $\delta a_\mu$, and joint analysis of all three. The inner and
outer contours enclose 68% and 95% of the total probability respectively. All of the other input
parameters in each plane have been marginalised over.



Hence, to evaluate $R$, we calculate the Bayesian evidence for the joint as well as individual
analysis with $\mathbf{D}$, $(g-2)_\mu$ and $BR(b \to s\gamma)$. We evaluate

$$\log R = 0.28 \pm 0.15, \qquad (4.7)$$

showing that even the slight inconsistency found between $(g-2)_\mu$ and $BR(b \to s\gamma)$ without
treating $\mathbf{D}$ as additional priors on the mSUGRA model has now vanished, which means that
the $\mathbf{D}$ data-sets have cut off the discrepant regions of the two constraints.
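Written in terms of evidences, the conditional form of $R$ above reduces to the joint evidence times the evidence for $\mathbf{D}$ alone, divided by the product of the two evidences that each add one observable to $\mathbf{D}$. A minimal Python sketch of this bookkeeping follows. The function name and the numerical inputs are hypothetical placeholders, not values from the fits, and adding the uncertainties in quadrature assumes the four evidence estimates come from independent runs.

```python
import math


def log_r_conditional(logz_joint, logz_a, logz_b, logz_d):
    """log R when the remaining observables D act as additional priors.

    Each argument is a (log-evidence, uncertainty) pair:
      logz_joint : evidence for D plus both observables
      logz_a     : evidence for D plus the first observable only
      logz_b     : evidence for D plus the second observable only
      logz_d     : evidence for D alone
    log R = log Z_joint + log Z_D - log Z_a - log Z_b, with the
    uncertainties combined in quadrature (independence assumed).
    """
    value = logz_joint[0] + logz_d[0] - logz_a[0] - logz_b[0]
    error = math.sqrt(sum(err**2 for _, err in (logz_joint, logz_a, logz_b, logz_d)))
    return value, error


# Placeholder inputs, chosen only to exercise the arithmetic.
print(log_r_conditional((-10.0, 0.1), (-6.0, 0.1), (-5.0, 0.1), (-1.5, 0.1)))
```

Four evidence estimates enter this form of $R$ rather than three, so under these assumptions its statistical uncertainty is expected to be larger than for the unconditioned test.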

5. Summary and Conclusions 



Bayesian analysis methods have been used successfully in astronomical applications [69, 70,
71, 72, 73, 74, 75, 76, 77, 30]. However, the application of Bayesian methods to problems in
particle physics is less established, due perhaps to the highly degenerate and multi-modal
parameter spaces which present a great difficulty for the standard MCMC based techniques.
Bank sampling [27] provides a practical means of MCMC parameter estimation and
evidence ratio estimation under such circumstances, but it cannot calculate the evidence itself.
We have shown that the MultiNest technique not only handles these complex distributions
in a highly efficient manner but also allows the calculation of the Bayesian evidence,
enabling one to perform model comparison. This could be of great importance in
distinguishing between different beyond the Standard Model theories once high quality data
from the LHC become available.

Our central results are summarised in Table 5. It is clear that, in global mSUGRA
fits to indirect data, $\mu > 0$ is somewhat preferred to $\mu < 0$, mainly due to data from
the anomalous magnetic moment of the muon, which outweighs the preference for $\mu < 0$
from the measured branching ratio of a $b$ quark into an $s$ quark and a photon and the SM
prediction when some of the NNLO QCD contributions are included. For a given measure
and range of the prior, the evidence ratio between the different signs of $\mu$ is accurately
determined by the MultiNest technique. Despite additional data from the $b$-sector
and the anomalous magnetic moment of the muon having a higher discrepancy with the
Standard Model prediction, there is still not enough power in the data to make the fits
robust. We see a signal of this in the fact that the evidence ratio $P_+/P_-$ is highly
dependent upon the measure and range of the prior distribution of mSUGRA parameters.
We obtain $P_+/P_-$ = 6 to 61 depending upon which range and which measure is chosen. All
of these values exhibit positive evidence, but on the scale summarised in Table ||, 'weak'
evidence is characterised as being bigger than 3, 'moderate' as bigger than 12. Thus we
cannot unambiguously conclude that the evidence is strongly in favour of $\mu > 0$: for some
prior choices it is only weak. A further test also suggested that within one prior measure
and range, and for $\mu > 0$, the tension between the observables $(g-2)_\mu$ and $BR(b \to s\gamma)$
is not statistically significant.
A. Consistency Check with Bayesian Evidence 

In order to motivate the use of Bayesian evidence to quantify the consistency between 
different data-sets as discussed in Section g, we apply the method to the classic problem 
of fitting a straight line through a set of data points. 
A.1 Toy Problem

We consider that the true underlying model for some process is a straight line described
by:

$$y(x) = m x + c, \qquad (A.1)$$

where $m$ is the slope and $c$ is the intercept. We take two independent sets of measurements
$D_1$ and $D_2$, each containing 5 data points. The $x$ values for all these measurements are
drawn from a uniform distribution $U(0,1)$ and are assumed to be known exactly.

A.1.1 Case I: Consistent Data-Sets
Figure 7: Upper left: Data-sets $D_1$ and $D_2$ drawn from a straight line model (solid line) with
slope $m = 1$ and intercept $c = 1$ and subject to independent Gaussian noise with root mean
square $\sigma_1 = \sigma_2 = 0.1$. Upper right: Posterior $\Pr(m, c | \mathbf{D}, H_1)$ assuming that data-sets $D_1$ and
$D_2$ are consistent. Lower left: Posterior $\Pr(m, c | \mathbf{D}, H_1)$ for data-set $D_1$. Lower right: Posterior
$\Pr(m, c | \mathbf{D}, H_1)$ for data-set $D_2$. The inner and outer contours enclose 68% and 95% of the total
probability respectively. The true parameter value is indicated by red crosses.

In the first case we consider $m = 1$, $c = 1$ and add Gaussian noise with standard
deviation $\sigma_1 = 0.1$ and $\sigma_2 = 0.1$ for data-sets $D_1$ and $D_2$ respectively. Hence both
data-sets provide consistent information on the underlying process.

We assume that the errors $\sigma_1$ and $\sigma_2$ on the data-sets $D_1$ and $D_2$ are known exactly.
The likelihood function can then be written as:

$$\mathcal{L}(m, c) \equiv \Pr(\mathbf{D} | m, c, H) = \prod_i \Pr(D_i | m, c, H), \qquad (A.2)$$

where

$$\Pr(D_i | m, c, H) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp[-\chi_i^2/2], \qquad (A.3)$$

and

$$\chi_i^2 = \sum_j \frac{(y_j - y(x_j))^2}{\sigma_i^2}, \qquad (A.4)$$

where $y(x_j)$ is the predicted value of $y$ at a given $x_j$.

We impose uniform $U(0,2)$ priors on both $m$ and $c$. In Fig. 7 we show the data
points and the posterior for the analysis assuming the data-sets $D_1$ and $D_2$ are consistent.
The true parameter value clearly lies inside the contour enclosing 68% of the posterior
probability.

In order to quantify the consistency between the data-sets $D_1$ and $D_2$, we evaluate $R$
as given in Eq. 2.6, which for this case becomes:

$$R = \frac{\Pr(D_1, D_2 | H_1)}{\Pr(D_1 | H_0) \Pr(D_2 | H_0)}, \qquad (A.5)$$

where the $H_1$ hypothesis states that the model jointly fits the data-sets $D_1$ and $D_2$, whereas
$H_0$ states that $D_1$ and $D_2$ prefer different regions of parameter space. We evaluate

$$\log R = 3.2 \pm 0.1, \qquad (A.6)$$

showing strong evidence in favour of $H_1$.

A.1.2 Case II: Inconsistent Data-Sets

We now introduce systematic error into the data-set $D_1$ by drawing from an incorrect
straight line model with $m = 0$ and $c = 1.5$. Measurements for $D_2$ are still drawn from a
straight line with $m = 1$ and $c = 1$. We assume that the errors $\sigma_1 = 0.1$ and $\sigma_2 = 0.1$, for
$D_1$ and $D_2$ respectively, are both quoted correctly.

We impose uniform priors, $U(-1,2)$ and $U(0,2)$, on $m$ and $c$ respectively. In Fig. 8
we show the data points and the posterior for the analysis assuming the data-sets $D_1$ and
$D_2$ are consistent as well as for the analysis with data-sets $D_1$ and $D_2$ taken separately.
In spite of the fact that the two sets of true parameter values define a direction along the
natural degeneracy line in the $(m, c)$ plane, neither of the true parameter values lies inside
the contour enclosing 95% of the posterior probability. Also, it can be seen that there
is no overlap between the posteriors for data-sets $D_1$ and $D_2$ and so both models can be
excluded at a high significance level. We again compute $R$ as given in Eq. A.5 and evaluate
it to be

$$\log R = -13.1 \pm 0.1, \qquad (A.7)$$

showing evidence in favour of $H_0$, i.e. the data-sets $D_1$ and $D_2$ provide inconsistent
information on the underlying model.
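The two cases of the toy problem can be reproduced in outline with the short Python sketch below. It is an illustration only: the evidences are obtained by brute-force averaging of the likelihood over a grid in $(m, c)$, which is feasible in two dimensions and stands in for the MultiNest evidence calculation; all function names, the grid resolution and the random seed are assumptions of this sketch. The likelihood is normalised per data point, and the normalisation cancels in $R$. Since the noise realisation differs from the one used for Eqs. (A.6) and (A.7), the printed values are not expected to match those numbers.

```python
import numpy as np


def log_likelihood_grid(x, y, sigma, m_grid, c_grid):
    """Gaussian log-likelihood of y = m*x + c evaluated on an (m, c) grid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m = m_grid[:, None, None]
    c = c_grid[None, :, None]
    resid = (y[None, None, :] - (m * x[None, None, :] + c)) / sigma
    norm = -0.5 * len(x) * np.log(2.0 * np.pi * sigma**2)
    return norm - 0.5 * np.sum(resid**2, axis=-1)


def log_evidence(log_like):
    """Log of the likelihood averaged over a uniform prior on the grid."""
    peak = log_like.max()
    return peak + np.log(np.mean(np.exp(log_like - peak)))


def log_r(x1, y1, x2, y2, sigma, m_range, c_range, n=401):
    """log R of Eq. (A.5): joint evidence over the product of separate ones."""
    m_grid = np.linspace(m_range[0], m_range[1], n)
    c_grid = np.linspace(c_range[0], c_range[1], n)
    ll1 = log_likelihood_grid(x1, y1, sigma, m_grid, c_grid)
    ll2 = log_likelihood_grid(x2, y2, sigma, m_grid, c_grid)
    return log_evidence(ll1 + ll2) - log_evidence(ll1) - log_evidence(ll2)


rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
sigma = 0.1
x1, x2 = rng.uniform(0.0, 1.0, 5), rng.uniform(0.0, 1.0, 5)
y2 = 1.0 * x2 + 1.0 + rng.normal(0.0, sigma, 5)

# Case I: D1 drawn from the same line as D2 (m = 1, c = 1).
y1_same = 1.0 * x1 + 1.0 + rng.normal(0.0, sigma, 5)
print("Case I  log R:", log_r(x1, y1_same, x2, y2, sigma, (0.0, 2.0), (0.0, 2.0)))

# Case II: D1 drawn from the incorrect line (m = 0, c = 1.5).
y1_shift = 0.0 * x1 + 1.5 + rng.normal(0.0, sigma, 5)
print("Case II log R:", log_r(x1, y1_shift, x2, y2, sigma, (-1.0, 2.0), (0.0, 2.0)))
```

A positive $\log R$ favours $H_1$ (consistent data-sets) and a negative value favours $H_0$. The sign in any single noise realisation can fluctuate, which is why the result is quoted with an uncertainty in the text.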
Figure 8: Upper left: Data-sets $D_1$ and $D_2$ drawn from a straight line model (solid line) with
slope $m = 0$, $c = 1.5$ and $m = 1$, $c = 1$ respectively and subject to independent Gaussian noise with
root mean square $\sigma_1 = \sigma_2 = 0.1$. Upper right: Posterior $\Pr(m, c | \mathbf{D}, H_1)$ assuming that data-sets
$D_1$ and $D_2$ are consistent. Lower left: Posterior $\Pr(m, c | \mathbf{D}, H_1)$ for data-set $D_1$. Lower right:
Posterior $\Pr(m, c | \mathbf{D}, H_1)$ for data-set $D_2$. The inner and outer contours enclose 68% and 95% of
the total probability respectively. The true parameter values are indicated by red and black crosses
for data-sets $D_1$ and $D_2$ respectively.